Improving Language Modeling using Densely Connected Recurrent Neural Networks

نویسندگان

  • Fréderic Godin
  • Joni Dambre
  • Wesley De Neve
چکیده

In this paper, we introduce the novel concept of densely connected layers into recurrent neural networks. We evaluate our proposed architecture on the Penn Treebank language modeling task. We show that we can obtain similar perplexity scores with six times fewer parameters compared to a standard stacked 2layer LSTM model trained with dropout (Zaremba et al., 2014). In contrast with the current usage of skip connections, we show that densely connecting only a few stacked layers with skip connections already yields significant perplexity reductions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Translation Modeling with Bidirectional Recurrent Neural Networks

This work presents two different translation models using recurrent neural networks. The first one is a word-based approach using word alignments. Second, we present phrase-based translation models that are more consistent with phrasebased decoding. Moreover, we introduce bidirectional recurrent neural models to the problem of machine translation, allowing us to use the full source sentence in ...

متن کامل

Syntactically Informed Text Compression with Recurrent Neural Networks

We present a self-contained system for constructing natural language models for use in text compression. Our system improves upon previous neural network based models by utilizing recent advances in syntactic parsing – Google’s SyntaxNet – to augment character-level recurrent neural networks. RNNs have proven exceptional in modeling sequence data such as text, as their architecture allows for m...

متن کامل

Improving English Conversational Telephone Speech Recognition

The goal of this work is to build a state-of-the-art English conversational telephone speech recognition system. We investigated several techniques to improve acoustic modeling, namely speaker-dependent bottleneck features, deep Bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks, data augmentation and score fusion of DNN and BLSTM models. Training set consisted of the 300 ho...

متن کامل

Sequence Modeling with Recurrent Tensor Networks

We introduce the recurrent tensor network, a recurrent neural network model that replaces the matrix-vector multiplications of a standard recurrent neural network with bilinear tensor products. We compare its performance against networks that employ long short-term memory (LSTM) networks. Our results demonstrate that using tensors to capture the interactions between network inputs and history c...

متن کامل

On End-to-End Program Generation from User Intention by Deep Neural Networks

This paper envisions an end-to-end program generation scenario using recurrent neural networks (RNNs): Users can express their intention in natural language; an RNN then automatically generates corresponding code in a characterby-character fashion. We demonstrate its feasibility through a case study and empirical analysis. To fully make such technique useful in practice, we also point out sever...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017